Abstract

For distributed systems at large and e-commerce systems in particular, ratings play an increasingly 
important role. Ratings confer reputation measures about sources. This paper reports our formalization
of the rating process and argues that ratings should be context- and individual-dependent
quantities. In contrast to existing rating systems in many e-commerce or developer sites, our approach 
makes use of personalized and contextualized ratings for assessing source reputation. Our approach is 
based on a Bayesian probabilistic framework. 

1. Introduction 

In a networked environment where resources are distributed widely, locating relevant resources has 
been the subject of the information retrieval industry. However, reliability measures for the retrieved data 
are often missing.[1] This paper reports our progress on the development of a mathematical scheme and
computational tools for how approval ratings can be propagated and assimilated for the purpose of 
assigning reputation values to other agents. Degrees of approval are modeled in a probabilistic 
framework. We adopt the following definition for reputation: 

• Reputation: perception that an agent has of another's intentions and norms. 

Discussion on the meaning of this definition can be found in Mui, et al. (2002), along with how
reputation relates to concepts such as trust: 

• Trust: a subjective expectation an agent has about another's future behavior based on the history of 
their encounters. 

We argue that the failure of existing rating systems is partially due to the lack of personalization and 
contextualization in these systems. For example, consider an expert in French cuisine who, while browsing
the Amazon.com catalogue, comes across a French cookbook with a 3-star rating (as provided by other
raters). The average Amazon user is not likely to be a French cuisine expert. For our expert, the 3-star
rating is therefore of limited value. Unless the rating system can summarize the ratings of those who
are similar to the individual seeking a rating, the rating system is destined to provide noisy ratings. This 
example points to the importance of personalizing ratings for those seeking source sanctioning. Similarly, 
a user's reputation as a computer scientist is likely to be different from his or her reputation as a violinist. 
Existing systems such as those in eBay and Slashdot do not take into account this context dependency. 
Section 2 covers the relevant literature. Section 3 formalizes intuitions concerning ratings. Section 4 
concludes this paper with thoughts on future work.

2. Background 

The information retrieval (IR) community has been tackling the question of information relevance 
for decades now, and with remarkable success (Salton, 1971; Kleinberg, 1997; Brin and Page, 1998).
However, simply locating relevant resources in distributed systems today is no longer adequate. Users 
demand not only relevant but also reliable information. 

Collaborative filtering is a paradigm for using the collective experiences of a cyber community to
select relevant resources (Goldberg, et al., 1992). Large numbers of users are often required before
reasonable performance is achieved. No metric exists to measure how reliable a rating is or how reputable
a rater is. The human-computer interface (HCI) community's recent interest in "social navigation" highlights
the importance of leveraging collective experience in aiding users interacting with information systems
(Dieberger, et al., 2000; Wexelblat, 2001). However, techniques for assessing the reliability of the
collected data are ad hoc.

[1] Although different in other contexts, reliability here is taken to confer source reputation. The intuition used here is
that the more reputed a source is, the more reliable the information gathered from it.

Prompted by the rise of on-line communities and their proliferation in the past decade, rating 
systems have been built to provide source reputation or reliability assessment. The on-line auction site eBay
provides a +1/-1 rating system for buyers to provide feedback about sellers and vice versa. A few recent
high-profile fraud cases on eBay by individuals with high eBay ratings suggest that the company should
seriously consider enhancing its simple rating system. Rating systems in such commercial sites
aggregate all users' ratings equally and ignore the personalized nature of reputation (i.e., one's reputation
means different things to different individuals depending on the context). We term this type of reputation 
measure "global" in nature - as opposed to being personalized for the inquirer's context. 

In the computer science literature, Zacharia and Maes (1999) have suggested that reputation in an
on-line community can be related to the ratings that an agent receives from others, and have pointed out 
several criteria for such rating systems. However, their mathematical formulation for the calculation of 
reputation can at best be described as intuitive - without justifications except the intuitive appeal of the 
resulting reputation dynamics. Ad-hoc formulation plagues several other proposals for rating or 
reputation systems such as those in Glass, et al. (2000), Yu and Singh (2000), Rouchier, et al. (2001), 
Sabater, et al. (2001), Esfandiari, et al. (2001), among others. Nevertheless, ratings and reputation 
measures have been found to provide useful intuition or services for users of these systems. 

3. Conceptual Framework 

On notation, consider the universe of users and objects in the system as the following sets:

Set of agents: $A = \{a_1, a_2, \ldots, a_M\}$    Set of objects: $O = \{o_1, o_2, \ldots, o_N\}$

Users rate the objects, which could be other users (i.e., $A \subset O$). The rating $\rho$ can be viewed as a
mapping from $A \times O$ to a real number between 0 and 1 (to accommodate probability).

Rating: $\rho : A \times O \to [0, 1]$

To model the process of opinion sharing among agents, the concept of an encounter is required. An
encounter is an event between two different agents $(a_i, a_j)$ such that the query agent $a_i$ asks the responding
agent $a_j$ for $a_j$'s rating of an object:

Encounter: $d \in D = A^2 \times O \times [0, 1] \cup \{\bot\}$

where $\{\bot\}$ represents the set of no encounter ("bottom").

Consider an agent $a_i$ who has never interacted with object $o_k$ in the past. Before taking the trouble
of interacting with $o_k$, $a_i$ asks other agents ($A_{-i}$) in the environment for their ratings of $o_k$. Agent $a_i$
will decide to interact with object $o_k$ if, for example, the weighted sum of the ratings (a proxy for $o_k$'s
reputation) from agents in $A_{-i}$ is greater than a certain threshold. The reputation of $a_j$ in $a_i$'s mind can be
considered as the probability that, in the next encounter, $a_i$ will approve of $a_j$'s rating for the object context.
A context describes a set of attributes about an environment. Let an attribute be defined as the presence ('1') or
absence ('0') of a trait. The set of all attributes is possibly countably infinite, and is defined as follows:

Attribute: $b \in B$ and $b : O \to \{0, 1\}$    Set of attributes: $B = \{b_1, b_2, \ldots\}$

A context is then an element of the power set of $B$:

Context: $c \in C$ and $C = P(B)$

where $P(\cdot)$ represents the power set. The reputation mapping can now be represented by:

Reputation: $R : A \times A \times C \to [0, 1]$

This mapping assumes that reputation only changes after an encounter - either directly with the individual
involved, or through rating feedback from other agents. After each encounter, the update rule for the
state of the system can be represented as below (it is discussed in the next section):

Update rule: $\mathit{newstate} : R \times D \to R$
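
As an informal illustration of this notation (not part of the formal model), the sets and mappings above might be represented in Python as follows. All names here (ATTRIBUTES, context_of, Encounter, ReputationTable), the example attributes, and the agent names are hypothetical choices made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

Agent = str
Obj = str
Context = FrozenSet[str]  # a set of attribute names, i.e. one element of P(B)

# Hypothetical attributes b : O -> {0, 1}, written as predicates over an
# object's description. The attribute names are invented for illustration.
ATTRIBUTES: Dict[str, Callable[[dict], bool]] = {
    "is_cookbook": lambda o: o.get("kind") == "cookbook",
    "is_french": lambda o: o.get("cuisine") == "french",
}


def context_of(description: dict) -> Context:
    """Return the attributes that are present ('1') for the described object."""
    return frozenset(name for name, b in ATTRIBUTES.items() if b(description))


@dataclass(frozen=True)
class Encounter:
    """One element of A^2 x O x [0, 1]; a missing encounter can be None."""
    query_agent: Agent
    responding_agent: Agent
    obj: Obj
    rating: float

    def __post_init__(self) -> None:
        if not 0.0 <= self.rating <= 1.0:
            raise ValueError("rating must lie in [0, 1]")


# Reputation R : A x A x C -> [0, 1], stored sparsely as a dictionary.
ReputationTable = Dict[Tuple[Agent, Agent, Context], float]

book = {"kind": "cookbook", "cuisine": "french"}
c = context_of(book)
table: ReputationTable = {("alice", "bob", c): 0.8}
```

Keying the table on the context makes the context dependence explicit: the same pair of agents can hold different reputation values under different attribute sets.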

3.1 Bayesian Reputation Inference 

Let $x_{ab}(i)$ be the indicator variable for agent a's approval of b after the $i$-th encounter between them;
thus the rating $\rho$ is restricted to binary values of 0 or 1. If a and b have had n encounters in the past for
context c, the proportion estimator $\hat{\theta}_c$ for the number of approvals of b by a can be modeled with a Beta
prior distribution (Dudewicz and Mishra, 1988):

\[ p(\hat{\theta}_c) = \mathrm{Beta}(c_1, c_2) \]

where $c_1$ and $c_2$ are parameters determined by prior assumptions and $\hat{\theta}_c$ is the estimator for the true
proportion of a's approval of b based on encounters between them for context c. (From this point on, a
given context c is assumed and the subscript is omitted for clarity.) Assuming that each encounter's
approval probability is independent of other encounters between a and b, the likelihood of having p
approvals and (n - p) disapprovals can be modeled as:

\[ L(D \mid \hat{\theta}) = \hat{\theta}^{p} (1 - \hat{\theta})^{n-p} \quad \text{where } D = \{x_{ab}(1), x_{ab}(2), \ldots, x_{ab}(n)\} \]

Combining the prior and likelihood, the posterior estimate can be shown to be (Mui, et al., 2001):

\[ p(\hat{\theta} \mid D) = \mathrm{Beta}(c_1 + p,\; c_2 + n - p) \]
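
The step from prior and likelihood to posterior is the standard Beta-Bernoulli conjugacy argument. It can be sketched as follows, writing $B(\cdot,\cdot)$ for the Beta function (a symbol introduced only for this sketch):

```latex
\begin{aligned}
p(\hat{\theta} \mid D) &\propto L(D \mid \hat{\theta})\, p(\hat{\theta}) \\
&= \hat{\theta}^{p} (1-\hat{\theta})^{n-p} \cdot
   \frac{\hat{\theta}^{c_1 - 1} (1-\hat{\theta})^{c_2 - 1}}{B(c_1, c_2)} \\
&\propto \hat{\theta}^{(c_1 + p) - 1} (1-\hat{\theta})^{(c_2 + n - p) - 1}
\end{aligned}
```

The last line is the kernel of a Beta density with parameters $c_1 + p$ and $c_2 + n - p$, so normalizing gives the posterior stated above.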

In our framework, reputation for b in a's mind is a's estimate of the probability that a will approve 
of b in the next encounter x(n + 1). This estimate is based on n previous encounters between them (for 
the specific context c). This estimate can be shown to yield the following (ibid.): 

\[ p(x_{ab}(n+1) = 1 \mid D) = E[\hat{\theta} \mid D] = r_{ab} \qquad (1) \]

This conditional expectation is the expression for $r_{ab}$: the update rule specified in the last section. First
order statistical properties of the posterior are summarized below for the posterior estimate of $\hat{\theta}$:



\[ E[\hat{\theta} \mid D] = \frac{c_1 + p}{c_1 + c_2 + n} \qquad \sigma^2_{\hat{\theta} \mid D} = \frac{(c_1 + p)(c_2 + n - p)}{(c_1 + c_2 + n + 1)(c_1 + c_2 + n)^2} \]



which indicates an easy implementation for the estimation procedure. Note that $r_{ab}$ in (1) is a
personalized estimate by agent a, which depends on the set of encounters D that agent a takes into
account about b. Specifying the set of encounters is individual-dependent and could include encounters
inferred from hearsay from agent a's neighbors, as suggested in the propagation procedures below.
We also have not yet specified how the prior belief about $\hat{\theta}$ is formed. Two situations are considered below.
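
A minimal sketch of this estimation procedure is given below, assuming binary ratings and the standard mean and variance of a Beta distribution. The class name BetaReputation and its method names are ours, chosen for illustration only; the default prior parameters of 1 are likewise an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class BetaReputation:
    """Agent a's belief about agent b for one fixed context."""
    c1: float = 1.0        # prior parameter c_1
    c2: float = 1.0        # prior parameter c_2
    approvals: int = 0     # p: encounters in which a approved of b
    encounters: int = 0    # n: all encounters taken into account

    def update(self, approved: bool) -> None:
        """Record one more encounter x_ab(i)."""
        self.encounters += 1
        if approved:
            self.approvals += 1

    def posterior(self) -> tuple:
        """Parameters (c_1 + p, c_2 + n - p) of the Beta posterior."""
        return (self.c1 + self.approvals,
                self.c2 + self.encounters - self.approvals)

    def reputation(self) -> float:
        """r_ab: posterior mean, the probability of approval next time."""
        alpha, beta = self.posterior()
        return alpha / (alpha + beta)

    def variance(self) -> float:
        """Posterior variance of the Beta(alpha, beta) distribution."""
        alpha, beta = self.posterior()
        total = alpha + beta
        return alpha * beta / (total ** 2 * (total + 1))


belief = BetaReputation()
for outcome in (True, True, False, True):
    belief.update(outcome)
print(belief.reputation(), belief.variance())
```

Only two counters are stored per agent pair and context, which is what makes the update inexpensive: each new encounter increments them, and the estimate is recomputed in constant time.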

3.2 Complete Strangers: Prior Assumptions 

If agents $a_i$ and $a_j$ are complete strangers, an ignorance assumption is made. When these two
strangers first meet, their estimate for each other's reputation should be uniformly distributed across the
reputation's domain:

\[ p(\hat{\theta}) = \begin{cases} 1 & 0 < \hat{\theta} < 1 \\ 0 & \text{otherwise} \end{cases} \]

For the Beta prior, values of $c_1 = 1$ and $c_2 = 1$ yield such a uniform distribution.
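
As a worked consequence (a standard result, often called Laplace's rule of succession, and not a claim made in the original text), substituting $c_1 = c_2 = 1$ into the posterior mean after p approvals in n encounters gives:

```latex
r_{ab} = E[\hat{\theta} \mid D] = \frac{1 + p}{2 + n},
\qquad \text{so that } r_{ab} = \tfrac{1}{2} \text{ when } n = 0 .
```

In other words, under the ignorance assumption a complete stranger starts with a reputation of one half, and each observed encounter moves the estimate away from that neutral value.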

3.3 Known Strangers: Prior Assumptions 

Suppose $a_i$ and $a_k$ have never met before, but $a_i$ knows $a_j$ well (i.e., $a_i$ has an opinion on $a_j$'s reputation)
and $a_j$ knows $a_k$ well. Agent $a_i$ would like to estimate $a_k$'s reputation based on $a_i$'s history of encounters with
$a_j$ and $a_j$'s history of encounters with $a_k$. This setup is depicted below:

Figure 1. Illustration of an indirect neighbor pair $(a_i, a_k)$.

For agent $a_i$, agent $a_k$ is a stranger, but $a_k$ is not completely unknown to $a_i$ since $a_i$ knows $a_j$, who has
an opinion about $a_k$. In a future encounter between $a_i$ and $a_k$, the probability that this encounter will be
rated good by $a_i$ is given by Equation (2) below. Let

$D_{ij}$ = all n encounters between $a_i$ and $a_j$,
$D_{jk}$ = all m encounters between $a_j$ and $a_k$,
$x_{ik}(n)$ = indicator variable for $a_i$'s approval of $a_k$ at encounter n.

\[ \begin{aligned} p(x_{ik}(n+1) = 1 \mid D_{ij}, D_{jk}) ={}& p(x_{ij}(n+1) = 1 \mid D_{ij}) \cdot p(x_{jk}(n+1) = 1 \mid D_{jk}) \\ &+ [1 - p(x_{ij}(n+1) = 1 \mid D_{ij})]\,[1 - p(x_{jk}(n+1) = 1 \mid D_{jk})] \qquad (2) \end{aligned} \]

The interpretation of this equation is that the probability that $a_i$ would approve of $a_k$ at encounter n + 1
is the sum of two probabilities: that $a_i$ approves of $a_j$ and $a_j$ approves of $a_k$, and that $a_i$ disapproves of
$a_j$ and $a_j$ disapproves of $a_k$ (all within the given context c).
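
The rule above needs only the two direct reputation estimates as inputs. A minimal sketch follows; the function name indirect_reputation and its argument names are hypothetical.

```python
def indirect_reputation(r_ij: float, r_jk: float) -> float:
    """Probability that a_i approves of a_k, inferred through a_j.

    r_ij: a_i's estimate that it will approve of a_j next time.
    r_jk: a_j's estimate that it will approve of a_k next time.
    """
    for r in (r_ij, r_jk):
        if not 0.0 <= r <= 1.0:
            raise ValueError("reputation estimates must lie in [0, 1]")
    both_approve = r_ij * r_jk
    both_disapprove = (1.0 - r_ij) * (1.0 - r_jk)
    return both_approve + both_disapprove


print(indirect_reputation(0.9, 0.8))
```

One property worth noting: if $a_i$ has no information about $a_j$ (an estimate of 0.5), the result is 0.5 whatever $a_j$ thinks of $a_k$, so an uninformative intermediary passes on no information.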

3.4 Inference Propagation 

So far, we have discussed belief estimation for indirect neighbors that are one degree away from
a direct neighbor. The reputation of indirect neighbors that are further away can be estimated by
recursively applying Equation (2), using closer neighbors as direct neighbors. This procedure works if
there exists a single path between two agents. This recursion has a fixed point for a community with a
finite population: when all members of the community have been rated, the recursion stops.
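
One possible reading of this recursion for a single path is a left fold of the two-link rule along the chain. The sketch below assumes that reading; the function names combine and chain_reputation are hypothetical.

```python
from functools import reduce
from typing import Sequence


def combine(r_near: float, r_far: float) -> float:
    """One application of the two-link rule of Section 3.3."""
    return r_near * r_far + (1.0 - r_near) * (1.0 - r_far)


def chain_reputation(link_reputations: Sequence[float]) -> float:
    """Fold the two-link rule along a single path a -> ... -> b.

    link_reputations[i] is the direct estimate held across the i-th link.
    """
    if not link_reputations:
        raise ValueError("a chain needs at least one link")
    return reduce(combine, link_reputations)


print(chain_reputation([0.9, 0.8, 0.7]))
```

Because the rule can be rewritten as $2r - 1 = (2r_1 - 1)(2r_2 - 1)$, each additional link with an estimate strictly between 0 and 1 pulls the result closer to the neutral value 0.5.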

If there are multiple paths between two agents, as illustrated in Figure 2, a different procedure is 
required. 

Figure 2. Illustration of a parallel network between two agents a and b (chains labeled Chain 1, Chain 2, ..., Chain k).




Figure 2 shows a parallel network of k chains between two agents of interest, where each chain
consists of at least one link. Agent a would like to estimate agent b's reputation as defined by the
embedded network between them. Clearly, to sensibly combine the evidence about b through all
intermediate agents in between, measures of "reliability" about these intermediates are required. One way
these measures can be calculated is shown in Mui, et al. (2002) by using the Chernoff bound on the parameter
estimate $\hat{\theta}$. Let m represent the minimum number of encounters necessary to achieve a given level of
confidence $\gamma$ for $\hat{\theta}$ and error bound $\epsilon$. The following inequality can be demonstrated:

\[ m \geq -\frac{1}{2\epsilon^2} \ln\left(\frac{1 - \gamma}{2}\right) \]

As $\gamma$ approaches 1, a larger m is required to achieve a given error bound $\epsilon$. This threshold (m) can
be set on the number of encounters between agents such that a reliable measure can be established as
follows:


\[ w_{ab} = \begin{cases} \dfrac{m_{ab}}{m} & \text{if } m_{ab} < m \\ 1 & \text{otherwise} \end{cases} \]

where $m_{ab}$ is the number of encounters between agents a and b. The intuition behind this formula is as follows: m
represents the minimum sample size of encounters needed to reach a given confidence (and error) level about the
parameter estimates. At or above this sample size, the estimator is guaranteed to yield the
specified level of confidence. Therefore, such an estimate can be considered as "reliable" with respect to
the confidence specification. Any sample size less than the threshold m is bound to yield less reliable
estimates. As a first order approximation, a linear drop-off in reliability is assumed here.
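
The threshold and the resulting reliability weight can be sketched as follows. The sketch assumes the bound has the Hoeffding-style form m >= -ln((1 - gamma) / 2) / (2 * epsilon^2), with gamma denoting the confidence level and epsilon the error bound; the function names min_encounters and link_weight are hypothetical.

```python
import math


def min_encounters(gamma: float, epsilon: float) -> int:
    """Smallest integer m with m >= -ln((1 - gamma) / 2) / (2 * epsilon**2)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma (confidence) must lie strictly between 0 and 1")
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon (error bound) must lie strictly between 0 and 1")
    return math.ceil(-math.log((1.0 - gamma) / 2.0) / (2.0 * epsilon ** 2))


def link_weight(m_ab: int, m: int) -> float:
    """Reliability of a link: linear below the threshold m, 1 at or above it."""
    if m <= 0 or m_ab < 0:
        raise ValueError("m must be positive and m_ab non-negative")
    return m_ab / m if m_ab < m else 1.0


m = min_encounters(gamma=0.95, epsilon=0.1)
print(m, link_weight(37, m), link_weight(500, m))
```

Rounding the bound up to an integer is a choice made in this sketch, since encounters are counted in whole numbers.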

For each chain in the parallel network, how should the total weight be tallied? Two
methods are plausible: additive and multiplicative. The problem with additive weighting is that if the chain
is "broken" by a highly unreliable link, the effect of that unreliability is local to the immediate agents
around it. In a long social chain, however, an unreliable link is certain to cast serious doubt on the
reliability of any estimate taken from the chain as a whole. On the other hand, a multiplicative weighting
has a "long-distance" effect in that an unreliable link affects any estimate based on a path crossing that link.

The form of a multiplicative estimate for chain i's weight ($w_i$) can be:

\[ w_i = \prod_{j=1}^{l_i} w_{ij} \quad \text{where } 0 < i \leq k \]

where $l_i$ refers to the total number of edges in chain i and $w_{ij}$ refers to the weight of the $j$-th segment of the $i$-th chain.
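
A short sketch of the multiplicative chain weight follows, assuming each segment weight is a reliability value in [0, 1]. The function name chain_weight is hypothetical.

```python
import math
from typing import Sequence


def chain_weight(segment_weights: Sequence[float]) -> float:
    """Product of the segment weights along one chain."""
    if not segment_weights:
        raise ValueError("a chain needs at least one segment")
    if any(not 0.0 <= w <= 1.0 for w in segment_weights):
        raise ValueError("segment weights must lie in [0, 1]")
    return math.prod(segment_weights)


print(chain_weight([1.0, 0.5, 0.2]))
```

A single weak segment lowers the weight of the whole chain, which is the "long-distance" effect described above; a chain of fully reliable segments keeps a weight of 1.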

Once the weights of all chains of the parallel network between the two end nodes are calculated, 
the estimate across the whole parallel network can be sensibly expressed as a weighted sum across all the 
chains: 

\[ r_{ab} = \sum_{i=1}^{k} r_{ab}(i)\, w_i \]

where $r_{ab}(i)$ is a's estimate of b's reputation using path i and $w_i$ is the normalized weight of path i
(the $w_i$ summed over all i yield 1). $r_{ab}$ can be interpreted as the overall perception that a has garnered about b
using all paths connecting the two.
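
The combination across chains can be sketched as below. Normalizing by dividing each chain weight by the sum of all chain weights is our assumption about how the weights are made to sum to 1; the function name combine_chains is hypothetical.

```python
from typing import Sequence, Tuple


def combine_chains(chains: Sequence[Tuple[float, float]]) -> float:
    """Weighted sum of per-chain estimates.

    Each entry is (r_ab_i, w_i): the estimate along chain i and that
    chain's unnormalized weight.
    """
    total = sum(w for _, w in chains)
    if total <= 0.0:
        raise ValueError("at least one chain must have positive weight")
    # Divide by the total so that the normalized weights sum to 1.
    return sum(r * (w / total) for r, w in chains)


print(combine_chains([(0.8, 0.6), (0.4, 0.2)]))
```

In this example the first chain carries three quarters of the total weight, so the combined estimate lies closer to 0.8 than to 0.4.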

4. Discussion and Conclusion 

This paper has briefly reported our Bayesian formalization of the rating process. We have argued 
that since ratings and reputations are clearly context dependent quantities, models about them should 
explicitly take the context into account. This formalization has been applied to simulated
and real-world ratings, as reported in Mui, et al. (2001). Our Bayesian approach achieves more
accurate reliability and reputation estimates than control schemes similar to those used in existing
systems such as eBay and Amazon.

Several issues remain to be resolved. We have examined binary ratings in this paper.
Extending the Bayesian framework to other types of ratings should be straightforward. Specifying the 
procedure for inferring ratings and reputations from one context to another is also needed. We have 
started investigating the use of ontologies to relate different contexts. A hard problem is to determine 
how to resolve the different ontological views of the world held by different agents. Furthermore, the 
metric or function to transfer rating or reputation from one context to the next is yet to be worked out. In 
our formulation, the calculation of an agent's reputation requires the disclosure of detailed personal rating 
and object descriptions to other agents. This creates a privacy concern which needs to be addressed. 